


GRAPHICAL STATISTICAL METHODS 

IU W. K. Hitilon, A.M. LICK.. M.LK.K. 




Introduction 

Broadly speaking, we are often concerned with examining the properties of similar things, and with forming some idea of what could be regarded as a typical object. A typical object might be described as one which is like the majority of objects in a fairly large sample. No less important are questions of the way individuals differ from the typical one, and what proportion of the total have approximately similar properties to the typical one.

Generally these ideas are formed intuitively, and comprise the 'experience' of the observer, and quite often such ideas are difficult to define and communicate to anyone else.

'Statistics' is a subject which is primarily concerned with classifying, grouping and examining data, and offers a language by which the above ideas can be conveyed. For example, the most probable value of an observed quantity (i.e., the typical value) is called the 'Mode,' and a constant which is descriptive of the way in which individuals cluster round the typical value is called the 'Standard Deviation.' Likewise there are recognized graphical methods of presenting and analysing data, such as the Histogram and Ogive, which have been designed to convey particular ideas and allow certain conclusions to be drawn with a minimum of labour.

Theories have been developed of populations of individuals and of the distribution of individuals in the population; of samples of individuals from a population and of the distribution of individuals in the sample. It is not surprising to find, therefore, that an ideal parent population has been conceived; ideal, that is, in the amount of information which can be inferred from the way individuals are distributed in the population, and in the simplicity with which it can be described and defined. This ideal distribution is known as the 'Normal Distribution' or the 'Gaussian Distribution,' and is found to approximate closely to many distributions of observable quantities in nature.

The main point of testing data, to see whether individuals follow a Normal Distribution, is to know whether the extensive predictions, proper to the ideal population, can be applied with any confidence to the problem in hand. This simply means that if the data is normally distributed much labour is saved, because the various predictions have been tabulated and published by theoretical workers in the field. If the data is not normal, one has to make one's own calculations and predictions.

The correct attitude is to decide what information is required from the data and whether a graphical solution is adequate, or whether one has sufficient data to justify a normality test or curve fitting analysis. Often one finds that there are so few individuals in the sample, or they are so widely dispersed, that a curve fitting analysis would be rather absurd, and obviously, if this is so, any labour-saving system of curve fitting like probability paper is equally unreliable. For this reason probability paper is a snare for the unwary, as the temptation to use it for very small samples from unknown populations is very great. Of course it is true that, if the data follows a normal distribution, the graph on the probability paper will be a straight line, but it may not be generally realized how little is the deviation in linearity which can be tolerated in practice for an assumption of normality to be permitted, as will be shown later.

The novelty and fascination of probability paper tends to eclipse the usefulness of the ordinary Ogive (from which it is derived), and an imperfect understanding of the evolution of probability paper can lead to errors due to incorrect plotting, or to a wrong interpretation of the curve. Some attempt is made, therefore, in the following notes, to provide a background of simple, graphical, statistical methods, and to demonstrate some inherent limitations of probability paper.

Wireless Engineer, December 1949






The Histogram

In surveying a mass of numerical results one would intuitively group identical results together, and perhaps arrange the resulting groups in ascending order of magnitude. A logical extension of this idea would be to group results which fall between definite boundaries, and to arrange these groups in ascending order of magnitude. In this way the vast amount of detail would be made more comprehensible, and any significant difference between one group and another would become apparent. For example, suppose that a Government Department was required to requisition suits for demobilized armed forces, and it was essential to conserve raw materials and labour. The problem would be to discover how many sizes of suits would be required, and how many of each should be manufactured.

The first step might be to take a sample of men at random, and measure their individual heights to the nearest inch, as shown in columns (1) and (2) in Table I. Note, therefore, that the true height of a man may actually lie anywhere within half an inch either side of the figure recorded.

TABLE I

[Heights of 609 men. Columns: (1) height in inches, to the nearest inch; (2) number of men; (3) number of men in group; (4) percentage of total men in group; (5) cumulative percentage of men. The tabulated values are illegible in this copy.]

Suppose that the data is regrouped into broader classes, for example, into men with heights between 57½ in and 60½ in, and between 60½ in and 63½ in, and so on, as shown in column (3). This can be plotted as shown in Fig. 1 (a), where each pillar represents a group of data by its height being proportional to the number of men within the group, and the edges of the pillar define the group boundaries. Such a figure is known as a Histogram, and is valuable for showing the 'Mode' or most frequent value, which in this case is about [illegible] in, and the dispersion, or scatter, either side of the Mode. In a word, it shows the distribution of individuals in the sample. (In some problems it may be more convenient to plot the height of Histogram pillars proportional to the percentage of total observations, as given in Table I, column (4) and shown in Fig. 4.)
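The grouping step can be sketched in a few lines of Python. This is only an illustration: the heights below are hypothetical (the article's Table I is not reproduced here), and the function name `histogram_counts` is an assumption of this sketch. Class boundaries are placed at half-inches, as in the text, so that a height recorded to the nearest inch never falls on a boundary.

```python
from bisect import bisect_right

def histogram_counts(values, edges):
    """Count how many values fall in each class [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v < edges[-1]:
            counts[bisect_right(edges, v) - 1] += 1
    return counts

# Hypothetical heights to the nearest inch (NOT the article's data).
heights = [59, 62, 63, 64, 65, 65, 66, 66, 66, 67, 67, 68, 68, 69, 70, 72]
# Boundaries at half-inches, three inches apart.
edges = [57.5, 60.5, 63.5, 66.5, 69.5, 72.5]

counts = histogram_counts(heights, edges)                 # pillar heights
percentages = [100 * c / len(heights) for c in counts]    # alternative units
modal_class = counts.index(max(counts))                   # tallest pillar

print(counts, percentages, edges[modal_class], edges[modal_class + 1])
```

The tallest pillar marks the class containing the Mode, and the spread of the remaining pillars either side of it shows the dispersion.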

Fig. 1. Histogram and Ogive, showing the distribution of heights of 609 men, and the method of obtaining the proportion of men with heights between 65 in and 67 in.

The Ogive

In practice, the group boundaries of the Histogram are chosen arbitrarily so as to make a presentable figure, and whereas it is possible to see what proportion of men have heights falling within any one group, it is not easy to see what proportion would have heights between 65 in and 67 in (say), or what proportion exceed 70 in in height.






Fig. 2. Dot diagram, Histogram and Ogive of the measured resistance of a heavy current switch under full-load conditions, showing extrapolation of the extremes.

To solve this kind of problem it is better to compute the cumulative percentage of men, as shown in Table I, column (5). For example, [illegible]% of the men had heights between 57½ in and 60½ in and [illegible]% between 57½ in and 63½ in, and so on. Notice that no man is recorded with height less than 57½ in, and that the first of these percentages was encountered over the whole range 57½ in to 60½ in; therefore the cumulative percentages must be plotted at the edges of the class intervals, as shown in Fig. 1 (b). This curve is an 'Ogive,' and provides different information from the Histogram. For example, the proportion of men having heights less than 67 in is 52% and that less than 65 in is [illegible]%, therefore the proportion of men between heights 65 in and 67 in is the difference.
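A minimal sketch of the Ogive construction follows, assuming hypothetical group counts (not those of Table I). The function names are illustrative, and straight-line interpolation between plotted points stands in for the smooth curve one would draw by hand. The cumulative percentages are attached to the class boundaries, not to the class centres.

```python
from itertools import accumulate

def ogive(edges, counts):
    """Return (boundary, cumulative %) points, one per class boundary."""
    total = sum(counts)
    cumulative = [0] + list(accumulate(counts))
    return [(e, 100 * c / total) for e, c in zip(edges, cumulative)]

def percent_below(points, x):
    """Read the Ogive at x by linear interpolation between plotted points."""
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return p0 + (p1 - p0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the plotted range")

edges = [57.5, 60.5, 63.5, 66.5, 69.5, 72.5]
counts = [1, 2, 6, 5, 2]          # hypothetical numbers of men per group

points = ogive(edges, counts)
# Proportion between 65 in and 67 in is the difference of two readings.
between = percent_below(points, 67) - percent_below(points, 65)
print(points)
print(round(between, 1))
```

Because both readings are taken from the same cumulative curve, any pair of heights can be used, whether or not they coincide with the histogram boundaries.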

This means that if one standard design of suit is intended to fit men between 65 in and 67 in in height, the number to be requisitioned should be [illegible]% of the total, and so on. Likewise it seems hardly worth while making suits in bulk for statures less than about [illegible] in or over [illegible] in, and these men could be fitted individually.

One valuable property of the Ogive is that it is fairly insensitive to the choice of class intervals (histogram groups), and therefore smooths out the data which was coarsened by grouping for the histogram.

This example has shown what valuable information can be obtained from the application of very elementary statistical tools and a little common sense, and there has been no need to test the data for normality, or confuse oneself with predictions based on the Normal Distribution.

The Ogive is particularly useful for analysing data whose typical value (Mode) tends to be close to an extreme value. For example, in inspecting electrical switches, the switch resistance can never be less than the resistance of the conductors forming the switch parts, but is always more, depending upon the condition of the contact surfaces. In such a case the Histogram is unsymmetrical or 'skew' [Fig. 2 (a)] and the Ogive shows the minimum resistance quite clearly, which is, of course, a useful parameter for judging the quality of the design.

If a technique is employed so that the extreme points of zero and 100% are not plotted, the Ogive gives an extrapolated value for the minimum resistance which is insensitive to the choice of histogram class intervals (pillar widths).

This is a reasonable technique to adopt, because otherwise the arbitrary choice of class interval would arbitrarily fix the point of zero cumulative observations at the edge of the first pillar, whereas it is much better to let the entire data weight the choice of this point by extrapolation. Similar remarks apply to the 100% point. It is very instructive to experiment with grouping data differently, and observe how slightly the Ogive is affected.
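The experiment suggested above can be tried numerically. The sketch below is an assumption-laden illustration: the data are 200 artificial values spread evenly over a bell-shaped population (mean 66 in, standard deviation 3 in, both chosen arbitrarily), grouped once into 3 in classes and once into 2 in classes with different starting points.

```python
from statistics import NormalDist

def ogive(values, edges):
    """Cumulative % of observations lying below each class boundary."""
    n = len(values)
    return [(e, 100 * sum(v < e for v in values) / n) for e in edges]

def percent_below(points, x):
    """Read the Ogive at x by linear interpolation between plotted points."""
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return p0 + (p1 - p0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the plotted range")

# Hypothetical data: 200 evenly spread quantiles of an assumed population.
population = NormalDist(mu=66, sigma=3)
values = [population.inv_cdf((i + 0.5) / 200) for i in range(200)]

wide = ogive(values, [54 + 3 * k for k in range(9)])      # 54, 57, ..., 78
narrow = ogive(values, [55 + 2 * k for k in range(12)])   # 55, 57, ..., 77

reading_wide = percent_below(wide, 67)
reading_narrow = percent_below(narrow, 67)
print(round(reading_wide, 1), round(reading_narrow, 1))
```

The two readings of the percentage below 67 in agree to within a few percentage points, even though the two histograms drawn from the same groupings would look noticeably different.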

Graphical Grouping of Data

It saves a great deal of time and labour to record each observation directly on the graph paper to be used for the Histogram, by making a bold dot opposite the appropriate point on the abscissa, thus forming rudimentary columns of dots as observations are repeated. This method has the great advantage that one can see when sufficient data has been collected, as the distribution of dots begins to suggest a definite form. It is then easy to mark the dots off into suitable groups and construct a Histogram based on the number of dots falling into each group as shown in Fig. 2. (Where a dot falls on a group boundary, one half of an observation should count in each group.) This is called a Dot Diagram, and the intermediate step of drawing the Histogram is not necessary for constructing an Ogive, as the total number of dots encountered up to a given boundary can be seen directly. Likewise, by observing quantities to the nearest convenient unit, the dots form clean columns and have the appearance of histogram pillars, which, therefore, need not be drawn.

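The half-observation rule for dots falling on a group boundary can be written out as follows. The readings are hypothetical and are held as whole numbers (units of one hundredth of the measured quantity) so that "on the boundary" is an exact comparison; the function name is an assumption of this sketch.

```python
def grouped_counts(readings, edges):
    """Count dots per group; a dot on a boundary adds one half to each side.

    A dot lying exactly on the outermost boundary has only one neighbouring
    group, so the boundaries should be chosen to enclose all the data.
    """
    counts = [0.0] * (len(edges) - 1)
    for r in readings:
        for i in range(len(counts)):
            lo, hi = edges[i], edges[i + 1]
            if lo < r < hi:
                counts[i] += 1.0
            elif r == lo or r == hi:
                counts[i] += 0.5
    return counts

# Hypothetical dot diagram: each entry is one dot at that scale position.
readings = [30, 31, 31, 32, 32, 32, 33, 34, 36]
edges = [29, 31, 33, 35, 37]

counts = grouped_counts(readings, edges)
print(counts)
```

The total of the group counts still equals the number of dots, so the cumulative totals needed for an Ogive can be taken straight from the dot diagram.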
Another advantage of collecting experimental data by dot diagrams is that it shows up any drift in performance due to warming up of equipment, etc. This becomes apparent when one finds the columns of dots not being filled in a random manner, but that one's hand is gradually moving across the page, as time goes on. This is most noticeable in the resistance measurements mentioned earlier, and the solution is to make several independent experiments, noting the time from the instant of switching on at which the observation is made. (Obviously the switch must be given time to cool down between experiments and, strictly speaking, the observations should be made at about the same rate.) The data is recorded as bold dots, or other marks, located at the appropriate time, as shown in Fig. 3, and constitutes a Scatter Diagram, which is intended to show whether one variable seems to depend on another. Clearly, in Fig. 3, the value of switch resistance observed depended, to a great extent, on the relative time at which the observation was made.

The method of treating the data is to divide the diagram into suitable vertical strips so that the dots enclosed can be considered to have occurred at roughly the same relative time. Then each strip of data can be treated as a normal dot diagram to form Histograms or Ogives as required. Fig. 3 thus gives a very clear picture of the performance of the switch. For example, about five minutes after switching on, the odds would be even that the switch resistance would not be greater than [illegible] mΩ, because the Ogive shows that out of a large number of observations, one can expect 50% to be below this value and 50% above. Likewise, the chance that the resistance would exceed 0.47 mΩ is about one in twenty, because the Ogive shows that about 95% of a large number of observations could be expected to be below this value.

Fig. 3. Histograms and Ogives of the measured resistance of a heavy current switch under full-load conditions, showing the method of treating drift.






Before leaving the subject of Dot and Scatter Diagrams, there is one disadvantage that should be mentioned, namely that because one can see the diagram taking shape, one is inclined to cheat, or shall we say, be biased in one's judgment, and cast the dots into where one thinks they should go. A great deal of self-discipline has to be exercised.



Fig. 4. Histogram with alternative units and the Probability Paper Ogive of the distribution of heights of 609 men (as for Fig. 1), showing the correct method of plotting points at the edges of the class intervals.

Arithmetical Probability Paper 

Probability Paper is simply a special graph paper for constructing Ogives, in which the vertical scale for 'Percentage of Total Observations' is stretched at the extremities in a particular way. This scale is called the Probability Scale because the Ogive represents a population of a large number of individuals from which one can deduce the odds of an observation falling above or below a particular value. Let it be emphasized at once that it is not essential to use special probability paper to do this; indeed we have seen in the previous Section that the common Ogive can be used.

The prefix 'Arithmetical' means that the scale for the variable is linear, while with Logarithmic Probability Paper the scale for the variable is logarithmic, the probability scale being the same as for the arithmetical paper. Obviously, one could use a logarithmic scale for the variable in the construction of the common Ogive, if such a scale offered a clearer picture of the data.

Most Ogives are S-shaped curves, and become difficult to read at the extremes. If the scale is expanded over these sections, the readability of the graph is increased, but this does not mean that the reliability of the data is increased. In exactly the same way, inspecting the position of an ammeter pointer with a powerful magnifying lens does not increase the precision of a current measurement, because the calibration may not be reliable. This means that when Ogives are plotted on probability paper, one must be extremely cautious about making use of extrapolated information at the extremes of probability scales, say, outside the range of 5% to 95%.

At this stage it is convenient to emphasize the correct method of plotting data on probability paper. Since the curve is really an Ogive, the points corresponding to cumulative groups of data must be plotted at the boundaries of the Histogram groups, as shown in Fig. 4, and not at the central values of the groups which, by the way, is a common error. Since there is no zero or 100% on the probability scale, data for these points must necessarily be omitted.

If the plotted points fall exactly on a straight line, the data is normally distributed, because the probability scale has been specially stretched to make this so. Naturally one does not expect experimental points to fit a straight line perfectly, but the limited extent of deviation allowable may not be appreciated, and one should have at least twenty points to plot before drawing any serious conclusions about normality.
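In modern terms the stretched scale is the inverse of the normal cumulative distribution, which Python's standard library provides as `NormalDist().inv_cdf`. The sketch below assumes a hypothetical normal population (mean 66 in, standard deviation 3 in). The correlation coefficient used as a measure of straightness is a convenience of this sketch, not a method taken from the article.

```python
from statistics import NormalDist

def probability_scale(percent):
    """Position on the stretched scale for a cumulative % (0 < percent < 100).

    Zero and 100% have no position on the scale: inv_cdf rejects 0 and 1.
    """
    return NormalDist().inv_cdf(percent / 100)

def pearson_r(xs, ys):
    """Correlation coefficient, used here as a rough measure of straightness."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical normal population; cumulative % taken at the group boundaries.
population = NormalDist(mu=66, sigma=3)
boundaries = [60, 62, 64, 66, 68, 70, 72]
cumulative = [100 * population.cdf(b) for b in boundaries]

plotted = [probability_scale(p) for p in cumulative]
straightness = pearson_r(boundaries, plotted)
print(round(straightness, 6))
```

For data that is exactly normal the plotted points lie on a straight line, which is all that the special paper guarantees. As the next paragraphs of the article show, the converse inference from a roughly straight line is much weaker.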

This is clearly demonstrated in Fig. 5, which shows data plotted on probability paper, from rectangular and triangular populations. Most people would feel justified in drawing a straight line through the points given by either of the 8-cell Histograms and might, therefore, be led erroneously to believe that both populations were normally distributed.



Fig. 5. Ogives plotted on Probability Paper for Histograms of data from rectangular and triangular parent populations, showing the small departure from linearity when few points are available.



The data for the [illegible]-cell Histograms begins to show the curvature at the extremities, which is clearly discerned with [illegible] cells, so that in practice it is advisable to have at least twenty points when testing for normality. Even then, the recommended attitude to be adopted, when an approximately straight line is obtained, is that such data could arise from a normal population, and not that it does.

The examples shown may be considered rather wide deviations from a normal population, and illustrate the limited sensitivity of probability paper to discriminate between distributions when relatively few points are available. One feature worth noting is the relatively high probability of the first point for the rectangular distribution compared with the more normal triangular one, for example, 12.5% for the rectangular against 3.1% for the triangular. This reveals that the former points are coming from a rectangular type of distribution, because quite a high proportion of the results are encountered for quite a small invasion into the edge of the data.
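The first-point percentages quoted above, and the near-straightness of both plots, can be checked with a short calculation. The sketch assumes idealized populations on the interval 0 to 1 (a uniform one, and a symmetrical triangle peaking at the centre) divided into eight equal cells. The correlation coefficient is again only a convenient stand-in for judging straightness by eye.

```python
from statistics import NormalDist

def rectangular_cdf(x):
    """Cumulative proportion for a uniform population on [0, 1]."""
    return x

def triangular_cdf(x):
    """Cumulative proportion for a symmetrical triangle on [0, 1]."""
    return 2 * x * x if x <= 0.5 else 1 - 2 * (1 - x) ** 2

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

cells = 8
# Interior boundaries only: 0% and 100% cannot be shown on the scale.
edges = [k / cells for k in range(1, cells)]

results = {}
for name, cdf in (("rectangular", rectangular_cdf),
                  ("triangular", triangular_cdf)):
    percents = [100 * cdf(e) for e in edges]
    plotted = [NormalDist().inv_cdf(p / 100) for p in percents]
    results[name] = (percents[0], pearson_r(edges, plotted))

for name, (first_percent, r) in results.items():
    print(name, first_percent, round(r, 4))
```

Both sets of seven points give a correlation above 0.99 with a straight line, although neither population is normal. The clearest difference between them is the size of the first percentage, as the text observes.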



Ungrouped Data

The popularity of probability paper is due, in no small measure, to the ease with which random individual observations can be handled, although the common Ogive can be used in exactly the same way. (This knowledge may save a lot of time and labour when supplies of the special graph paper are not available.)

Let us consider the case of our first example, the heights of men to be fitted with suits. The Histogram of Fig. 1 has analysed the results of 609 individual observations, and the question is whether a sample of, say, ten individuals could provide the same information. Obviously not, but a fair idea of the distribution can be obtained if certain assumptions are justified.

The first assumption to be made is that each observation has the same weight. In other words, that each of the ten observations represents about the same number of men in the parent population of 609. This means that each individual observation in the sample is assumed to represent 60.9 men or 10% of the total men in the parent population.






The second assumption is that each sample individual is typical of the 10% of the parent population it is assumed to represent, that is, that the majority of the 60.9 heights of men can be considered to be clustered round the sample value.

A third assumption (that, from previous experience or other considerations, the data can be expected to be normally distributed) is valuable, but not essential, as it means that, using probability paper, the best straight line can be drawn through the points, and so allows fairly small samples to be used.

The second assumption is the key to the method of plotting, because it means that if the sample individuals are arranged in ascending order of magnitude, each in turn corresponds to the mean value of successive groups of population. In other words, each sample individual is assumed to fall near the centre of the group and therefore, in the example chosen, has 5% of the population below it to one boundary of its group, and 5% of the population above it to the other boundary of the group. In approaching the first sample individual then, 5% of the total parent population would be encountered, and in reaching the second, 15% would have been passed, made up of the first group (10%) plus half of the next group (5%), and so on. Thus for ten individuals in the sample, they should be plotted on the Ogive, or on probability paper, at the following percentages of total observations:

5, 15, 25, 35, 45, 55, 65, 75, 85, 95%.

In general, if there are n observations in a sample, they should be spaced at 100/n %, and the first observation should occur at 100/2n %. Example: suppose the heights of ten men, taken at random, were as shown in Table II, and that it was known that a normal distribution could be expected. The proportion of men with heights between 65 in and 67 in is required.

TABLE II

[Heights of ten men taken at random and arranged in ascending order, to be plotted on the Probability Scale at the following percentages: 5, 15, 25, 35, 45, 55, 65, 75, 85, 95%. The tabulated heights are illegible in this copy.]

The results are shown plotted on probability paper in Fig. 6 and, from the best straight line, we have:

Proportion of men having heights less than 67 in: [illegible]
Proportion of men having heights less than 65 in: [illegible]
Proportion of men having heights between 67 in and 65 in: [illegible]
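The plotting rule for ungrouped data, and the reading of a proportion from the best straight line, can be sketched as below. The ten heights are hypothetical (Table II's values are not used), the names are illustrative, and a least-squares fit stands in for the straight line that the article draws by eye on probability paper.

```python
from statistics import NormalDist

def plotting_positions(n):
    """Percentages for n ordered observations: 100/n apart, first at 100/2n."""
    return [100 * (i + 0.5) / n for i in range(n)]

def best_line(xs, ys):
    """Least-squares line y = a + b*x (a stand-in for a line drawn by eye)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical sample of ten heights (in), already in ascending order.
sample = [62, 63, 64, 65, 66, 66, 67, 68, 69, 71]
percents = plotting_positions(len(sample))
scale = [NormalDist().inv_cdf(p / 100) for p in percents]   # probability scale
a, b = best_line(sample, scale)

def percent_below(height):
    """Read the fitted straight line back into a cumulative percentage."""
    return 100 * NormalDist().cdf(a + b * height)

between = percent_below(67) - percent_below(65)
print(percents)
print(round(between, 1))
```

The answer rests on the third assumption above (normality), since the straight line is being trusted between and beyond the ten plotted points.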



Fig. 6. Results, plotted on Probability Paper, of a small sample of ten individual observations taken at random from the population of 609 heights of men shown in Figs. 1 and 4.

The Normal Distribution

So much has been said about testing for normality, and whether or not individuals can be expected to follow a normal distribution, that a short discussion of its properties is appropriate at this stage.

As mentioned previously, the Normal Distribution is a conception of an ideal population of individuals, from the point of view of mathematical analysis. It is based on the assumption that deviations of individuals, from the mean value, obey three laws:

(1) The probability of small deviations from the mean value is greater than the probability of large deviations.

(2) The probability of a certain deviation above the mean value is equal to the probability of an equal deviation below the mean value.

(3) The probability of a 'huge' deviation is very small indeed.

If we imagine a 'Normal' Histogram, the first law means that the central pillars tend to peak near the mean; the second law means that the histogram is symmetrical about the mean; and the third, that the columns vanish fairly quickly as the distance from the mean increases. If we now imagine the width of the histogram pillars to be made extremely small, and consequently the number of pillars extremely large, the tops of the pillars would follow a bell-shaped curve, as shown in Fig. 7. The mathematical law of this curve has been deduced from the postulations above, and a knowledge of this allows a great deal to be predicted about the probability of an individual falling between specified boundaries, i.e., into any specified Histogram group.

One of the most important deductions is that two parameters are sufficient to define and describe the distribution, and both parameters can be calculated without drawing the distribution, or conversely, obtained directly from the Histogram or Ogive, without calculation. These parameters are the Mean and the Standard Deviation.

Because the distribution is symmetrical, the Mean corresponds to the peak of the bell-shaped curve. The Standard Deviation is taken to be the distance from the Mean to the point of inflection of the curve, shown as sigma in Fig. 7, and both these points are easy to see on the Histogram.

As far as the Ogive is concerned, it has been calculated that, for a Normal Distribution, 34.1% of the total population can be expected between the mean and the standard deviation and, by symmetry, 50% of the population exists each side of the mean. This means that the standard deviation can be found by taking the difference in abscissa corresponding to ordinates of 50% and 15.9% (34.1% from the mean), or ordinates 50% and 84.1%, as shown in Fig. 7.
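Reading the two parameters off an Ogive amounts to reading the curve in reverse at two ordinates. The sketch below assumes an Ogive taken from a hypothetical normal population (mean 66 in, standard deviation 3 in) plotted at half-inch boundaries. The function name and the use of linear interpolation are assumptions of this illustration.

```python
from statistics import NormalDist

def value_at_percent(points, percent):
    """Read the Ogive in reverse: the abscissa where the cumulative % is reached."""
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= percent <= p1 and p1 > p0:
            return x0 + (x1 - x0) * (percent - p0) / (p1 - p0)
    raise ValueError("percentage lies outside the plotted range")

# Hypothetical Ogive of a normal population, plotted every half inch.
population = NormalDist(mu=66, sigma=3)
edges = [57 + 0.5 * k for k in range(37)]            # 57, 57.5, ..., 75
points = [(e, 100 * population.cdf(e)) for e in edges]

upper_ordinate = 100 * NormalDist().cdf(1)           # about 84.1%
mean = value_at_percent(points, 50)                  # abscissa at 50%
sigma = value_at_percent(points, upper_ordinate) - mean

print(round(mean, 2), round(sigma, 2))
```

The same two readings could equally be taken at the 50% and 15.9% ordinates, by symmetry.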

The Standard Deviation is used as a yardstick for specifying the distribution of individuals in a normal population, and tables are published giving the proportions of population encountered between the mean and various deviations expressed as fractions or multiples of a standard deviation. Table III is a simplified version of one.
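Entries of the kind found in such tables can be computed directly from the normal cumulative distribution. The sketch below uses Python's standard library, and the choice of deviations (two-thirds, one, two and three standard deviations) is an assumption made for illustration.

```python
from statistics import NormalDist

def table_row(k):
    """Percentages of population for a deviation of k standard deviations.

    Returns (between mean and deviation, within +/- deviation, outside it).
    """
    within = 100 * (NormalDist().cdf(k) - NormalDist().cdf(-k))
    return (within / 2, within, 100 - within)

rows = {k: table_row(k) for k in (2 / 3, 1, 2, 3)}
for k, (half, within, outside) in rows.items():
    print(f"{k:.2f} sigma: {half:5.1f} {within:5.1f} {outside:5.1f}")
```

A deviation of about two-thirds of a standard deviation encloses roughly half the population, and two standard deviations leave roughly one observation in twenty outside.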

TABLE III

                     Percentage of Total Population
Deviation        Between Mean     Between        Outside Range
from Mean        and Deviation    ± Deviation    of ± Deviation
(approx.)        (%)              (%)            (%)

⅔σ               25               50             50
1σ               34.1             68.3           31.7
2σ               47.7             95.4           4.6
3σ               49.9             99.7           0.3

For example, one can see that the chance of an individual observation falling outside the range ±2σ is approximately one in twenty. Likewise, the odds are about even that an observation would be within the range ±⅔σ. (One should be very confident that the data is normally distributed before venturing to predict for deviations greater than [illegible]σ.)

Fig. 7. Histogram, and Probability Paper Ogive, of a 'Normal Distribution,' showing the method of obtaining the parameters 'Mean' and 'Standard Deviation.'

Perhaps the most important of all deductions made by theoretical workers, concerning Normal Distributions, is that whatever the distribution of individuals in a parent population (rectangular, triangular, double humped, or any other shape), the distribution of the means of random samples drawn from that population tends to be Normal, provided, of course, the parent population remains stable.

This last comment is the key to one important application of Probability Paper, namely to test the stability of populations. Successive batches of twenty or more means of samples should show roughly the same mean and standard deviation if the population is stable. This is the basis of Quality Control, where instability in the parent population of objects means that rejects will occur unless the drift is arrested. When a sample is obtained with a mean value deviating from the grand mean by a certain amount, one is justified in using Normal Distribution Tables to compute the probability that such a deviation could be expected by pure chance, and act accordingly. Notice, however, that it is not absolutely essential to understand Normal Distributions to operate a Quality Control system; one could conduct a 100% inspection at the beginning of production, when the process was operating satisfactorily, and construct an Ogive from the results. This could then be used to define the limits within which subsequent sample individuals should fall.

Remember that if one intends to use the means of samples of five individuals, the Ogive must be constructed from the means of groups of five individuals. In short, decide on the size of sample, and commence with 100% inspection. This simply means that the samples are successive groups of individuals. Calculate the mean of each sample and plot as a dot diagram. When the dot diagram takes on a definite shape, plot the Ogive. Decide now how many false alarms can be tolerated, say one in ten, and select limit values from the Ogive accordingly. In the example chosen, one is prepared to be given a false alarm once in ten times; in other words, 10% of the total population will be allowed to fall outside the Ogive limit lines, so that about once in ten times an observed mean will exceed the limits by pure chance. This means that the limits must be chosen to allow 5% of the population to occur below the lower limit, and 5% above the upper limit. Thus draw lines on the Ogive at 5% and 95% and read off the corresponding limits of the variable.
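The procedure can be imitated with simulated data. In the sketch below the parent population is an assumed stable rectangular one, the sample size is five as in the text, and `statistics.quantiles` takes the place of reading the 5% and 95% lines off a hand-drawn Ogive. The seed and the number of samples are arbitrary choices.

```python
import random
from statistics import mean, quantiles

rng = random.Random(1949)   # fixed seed so the sketch is repeatable

def sample_means(count, size=5):
    """Means of `count` successive samples from a stable rectangular population."""
    return [mean(rng.uniform(0, 1) for _ in range(size)) for _ in range(count)]

# Initial 100% inspection: successive groups of five, one mean per group.
reference = sample_means(2000)

# Limits read at 5% and 95% of the cumulative distribution of the means.
cuts = quantiles(reference, n=20)      # 19 cut points: 5%, 10%, ..., 95%
lower, upper = cuts[0], cuts[-1]

# Later production from the same (still stable) population.
later = sample_means(2000)
false_alarms = sum(m < lower or m > upper for m in later) / len(later)

print(round(lower, 3), round(upper, 3), round(false_alarms, 3))
```

With a stable population the proportion of later means falling outside the limits stays near one in ten; a persistent excess on one side would be the sign of drift that the text goes on to describe.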

Take samples at reasonable intervals, and investigate as soon as any sample mean exceeds either of the limits. (If the sample means are plotted as a scatter diagram in time, and the limit lines are marked on the scatter diagram, it is easy to see when there is a drift in the process which will eventually cause rejects.)

This elementary system of Quality Control is the basis of the more refined methods, which are designed to economize in labour by taking advantage of other theoretical deductions and properties associated with Normal Distributions.

Conclusions

The author hopes that the reader will agree that the first conclusion is that a great deal of useful work can be done graphically, with Histograms, Ogives and Scatter Diagrams, without any theoretical knowledge of Normal Distributions, and the like. This is important because it encourages the use of efficient statistical methods among those who might be put off by a more theoretical approach, and provides a good background for the appreciation of the more advanced ideas, later on. The danger of a limited theoretical knowledge lies in the temptation to use apparently simple tools, which have been developed from very advanced theory, because then they are imperfectly understood, and erroneous conclusions may be drawn from the results.

Probability Paper is directly concerned with Normal Distributions and this implies that some knowledge of the theory of these distributions is desirable for its correct use. Generally the use of it will be justified only when some application of the properties of normal distributions is sought, for example, when used as a labour-saving device to avoid calculations, and economize in size of sample, when the data is known to be normally distributed. Or again, when testing the stability of populations, making use of the expected normal distribution of sample means.

The actual testing of data for normality should be regarded cautiously, and not seriously attempted with less than twenty points, preferably more. One should also have a clear idea whether a test for normality is justified. For example, in the problem of providing suits for men of various heights, it is absolute nonsense to make elaborate tests and extensive predictions concerning the normality of the heights, in order to avoid waste of materials, when the selection is ultimately influenced by personal preferences in colours, designs and cuts. An Ogive analysis would be adequate to get the proportions approximately right.

Likewise, as the author knows to his cost, it is advisable to check the data for stability of parent population before attempting any serious curve fitting.

The final conclusion drawn is that Probability Paper is rather an insensitive tool for discriminating between distributions, and a large number of points are required before pronouncing judgment on the probable parent population.

Acknowledgments 

The author wishes to express his thanks to the Chief Scientist of the Ministry of Supply for permission to publish this article, and to those friends and colleagues who first introduced the writer to this most fascinating and useful subject.









